Literal2Feature: An Automatic Scalable RDF Graph Feature Extractor

Abstract

The last decades have witnessed significant advancements in data generation, management, and maintenance. This has resulted in vast amounts of data becoming available in a variety of forms and formats, including RDF. As RDF data is represented as a graph structure, applying machine learning algorithms to extract valuable knowledge and insights from it is not straightforward, especially when the size of the data is enormous. Although Knowledge Graph Embedding models (KGEs) convert RDF graphs to low-dimensional vector spaces, these vectors often lack explainability. In contrast, in this paper we introduce a generic, distributed, and scalable software framework that is capable of transforming large RDF data into an explainable feature matrix. This matrix can be exploited in many standard machine learning algorithms. Our approach, by exploiting semantic web and big data technologies, is able to extract a variety of existing features by deep traversing a given large RDF graph. The proposed framework is open-source, well-documented, and fully integrated into the active community project Semantic Analytics Stack (SANSA). The experiments on real-world use cases show that the extracted features can be successfully used in machine learning tasks like classification and clustering.
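The idea of turning a graph into an explainable feature matrix by traversing it can be illustrated with a small single-machine sketch. It is not the framework's distributed implementation and does not use the SANSA API; the toy triples, the `ex:` prefix convention, and the names `is_iri` and `extract_features` are all hypothetical. Each column is named after the predicate path that leads from a seed entity to a literal, which is what keeps the features readable.

```python
from collections import defaultdict, deque

# Hypothetical toy RDF graph as (subject, predicate, object) triples.
# Assumption: IRIs are strings starting with "ex:"; anything else is a literal.
TRIPLES = [
    ("ex:alice", "ex:age", 34),
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:acme", "ex:employees", 120),
    ("ex:acme", "ex:locatedIn", "ex:berlin"),
    ("ex:berlin", "ex:population", 3600000),
    ("ex:bob", "ex:age", 27),
    ("ex:bob", "ex:worksFor", "ex:initech"),
    ("ex:initech", "ex:employees", 15),
]


def is_iri(term):
    """Return True for entity nodes, False for literal values."""
    return isinstance(term, str) and term.startswith("ex:")


def extract_features(triples, seeds, max_depth=2):
    """Breadth-first traversal from each seed, collecting reachable literals.

    max_depth is the maximum number of entity-to-entity hops to follow.
    Returns (columns, matrix); missing values are None.
    """
    outgoing = defaultdict(list)
    for s, p, o in triples:
        outgoing[s].append((p, o))

    rows = {}
    for seed in seeds:
        features = {}
        visited = {seed}
        queue = deque([(seed, (), 0)])
        while queue:
            node, path, depth = queue.popleft()
            for p, o in outgoing[node]:
                new_path = path + (p,)
                if is_iri(o):
                    # Follow the edge only within the hop limit, once per node.
                    if depth + 1 <= max_depth and o not in visited:
                        visited.add(o)
                        queue.append((o, new_path, depth + 1))
                else:
                    # The predicate path is the feature name; keep the first value.
                    features.setdefault("/".join(new_path), o)
        rows[seed] = features

    columns = sorted({c for f in rows.values() for c in f})
    matrix = [[rows[s].get(c) for c in columns] for s in seeds]
    return columns, matrix


if __name__ == "__main__":
    cols, mat = extract_features(TRIPLES, ["ex:alice", "ex:bob"])
    print(cols)
    for row in mat:
        print(row)
```

A column such as `ex:worksFor/ex:employees` can be read directly as "number of employees of the entity's employer", unlike a dimension of an embedding vector. Keeping only the first value per path is a simplification made for this sketch; multi-valued paths would need an aggregation choice.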

Similar resources

Scalable RDF Graph Querying Using Cloud Computing

With the explosion of semantic web technologies, conventional SPARQL processing tools do not scale well for large amounts of RDF data because they are designed for use in a single-machine context. Several optimization solutions combined with cloud computing technologies have been proposed to overcome these drawbacks. However, these approaches only consider the SPARQL Basic Graph Pattern pro...

Language Independent Feature Extractor

We propose a new customizable tool, Language Independent Feature Extractor (LIFE), which models the inherent patterns of any language and extracts relevant features of the language. There are two contributions of this work: (1) no labeled data is necessary to train LIFE (It works when a sufficient number of unlabeled documents are given), and (2) LIFE is designed to be applicable to any languag...

Type-based Semantic Optimization for Scalable RDF Graph Pattern Matching

Scalable query processing relies on early and aggressive determination and pruning of query-irrelevant data. Besides the traditional space-pruning techniques such as indexing, type-based optimizations that exploit integrity constraints defined on the types can be used to rewrite queries into more efficient ones. However, such optimizations are only applicable in strongly-typed data and query mo...

Scalable Graph Hashing with Feature Transformation

Hashing has been widely used for approximate nearest neighbor (ANN) search in big data applications because of its low storage cost and fast retrieval speed. The goal of hashing is to map the data points from the original space into a binary-code space where the similarity (neighborhood structure) in the original space is preserved. By directly exploiting the similarity to guide the hashing cod...

Computing the digest of an RDF graph

Keywords: RDF (Resource Description Framework), graph, digest, signature, cryptography, security. Efficient algorithms are needed for computing the digest of a Resource Description Framework (RDF) graph. These may be used to assign unique content-dependent identifiers and for use in digital signatures, allowing a recipient to verify that RDF was generated by a particular individual and/or has not been alte...

Journal

Journal title: Studies on the Semantic Web

Year: 2021

ISSN: 1868-1158

DOI: https://doi.org/10.3233/ssw210036